Abstract

This paper presents an embedded security sublanguage for enforcing
information-flow policies in the standard Haskell programming
language. The sublanguage provides useful information-flow control
mechanisms including dynamic security lattices, run-time code
privileges and declassification, without modifying the base
language. This design avoids the redundant work of producing new
languages, lowers the threshold for adopting security-typed
languages, and also provides great flexibility and modularity for
using security-policy frameworks.

The embedded security sublanguage is designed using a standard
combinator interface called arrows. Computations constructed in the
sublanguage have static and explicit control-flow components, making
it possible to implement information-flow control using
static-analysis techniques at run time, while providing strong
security guarantees. This paper presents a concrete Haskell
implementation and an example application demonstrating the proposed
techniques.

1. Introduction 

Language-based information-flow security has a long, rich history
with many (mostly theoretical) results [11]. This prior work has
focused mainly on the problems of static program analysis for a wide
variety of computation models and policy features. Often these
analyses are presented as type systems whose soundness is justified
by some form of noninterference result. The approach is compelling
because programming-language techniques can be used to specify and
enforce security policies that cannot be achieved by conventional
mechanisms such as access control and encryption. Two full-scale
language implementations have been developed: Jif [6, 4] by Myers et
al. is a variant of Java, and Flow Caml [13, 10] by Simonet et al. is
an extension of Caml.

However, despite this rather large (and growing!) body of work on
language-based information-flow security, there has been relatively
little adoption of the proposed techniques. Two success stories are
the “taint-checking mode” available in the Perl language and the use
of information-flow analysis to separate low-integrity components
from high-integrity components built in the SparkAda [3] language.

One important reason why these domain-specific security-typed
languages have not been widely applied is that, because the
information-flow policies are intended to apply in an end-to-end
fashion, the whole system has to be written in the new language.
However, it is expensive to build large software systems in a new
language. Doing so can be justified only if the benefit of using the
new language outweighs the cost of migrating to the new language,
including the costs of retraining programmers and the time and
expense necessary to port existing libraries and other code bases.

Moreover, in practice, it may well be the case that only a small part
of the system (maybe only a few variables in a large program) has
information-flow security requirements. Although the system may be
large and complex, the secret information flow in the system may not
be completely unmanageable. In such cases, it is probably more
convenient to use the programming language that best fits the primary
functionality of the system rather than its security requirements,
and manage security issues by traditional means such as code auditing
and careful software engineering practices. There is a language
adoption threshold based on the ratio of security requirements to
functionality requirements, and this threshold is very high.

1.1. Embedded security sublanguages 

This paper presents a different approach to enforcing
information-flow security policies. Rather than producing a new
language from scratch, we show how to encode traditional
information-flow type systems using the general features of existing,
modern programming languages. In particular, we show how the abstract
datatype and typeclass features found in the general-purpose language
Haskell can be used to build a module that effectively provides a
security-typed sublanguage embedded in Haskell itself. This
sublanguage can interoperate smoothly with existing Haskell code
while still providing strong information-flow security guarantees.
Importantly, we do not need to modify the design or implementation of
Haskell itself: we use features in its standard (but advanced) type
system.

Our approach eliminates the adoption threshold for systems
implemented in Haskell: such systems can be made more secure without
completely rewriting them in a new language. The implementation can
be a fine-grained mixture of normal code and security-hardened code
(variables, data and computations over secure data). The programmer
needs to protect only sensitive data and computation using a software
library, which enforces the information-flow policies throughout the
entire system and provides end-to-end security goals like
noninterference.

Another benefit of our approach is flexibility. A specialized
language like Jif must pick a fixed policy framework in which the
security policies are expressed. Considering the plethora of possible
features present in the literature with regard to how to express the
label lattice, declassification options [12], dynamic policy
information [14, 17], etc., it is unlikely that any particular choice
of policy language will be suitable for all programs with security
concerns. By contrast, since it is much easier to build a library
module than to build a new language, it is conceivable that different
programs would choose to implement entirely different policy
frameworks. Our embedded sublanguage approach is modular in the sense
that it provides an interface through which the programmer can choose
which policy framework and type system to use for specific security
goals. In this paper we sketch one possible policy framework that
illustrates one particular choice of label lattice, declassification
mechanism, and support for dynamic policies, but others could readily
be implemented instead.

Although we use Haskell’s advanced type system and helpful features
like the ability to overload syntax, studying how to encode
information-flow policies in the context of Haskell can point to how
similar efforts might be undertaken in more mainstream languages like
Java. Also, since the features we use are intended to be “general
purpose,” they are more likely to find a home in a mainstream
language than the less widely applicable security types. Evidence of
this can be found, for example, in Sun’s recent addition of
parametric polymorphism, a key component of our approach, to Java.

1.2. Overview of technical development 

There are two key technical challenges in embedding a useful
security-typed sublanguage in Haskell. The first problem is that
enforcing information-flow policies requires static analysis of the
control-flow graph of embedded programs; purely dynamic enforcement
mechanisms are generally too conservative in practice. The second
problem is representing the policy information itself: depending on
the desired model, the policy information might be quite complex,
perhaps depending on information available only at run time.

Our solution to the first problem is to use arrows [5]. Intuitively,
arrows provide an abstract interface for defining embedded
sublanguages that support standard programming constructs familiar to
programmers: sequential composition, conditional branches, and loops.
Haskell provides convenient syntactic sugar for writing programs
whose semantics are given by an arrow implementation.

To address the second problem, we use Haskell’s typeclass mechanism
to give an interface for security lattices. Programs written in the
embedded language can be parameterized with respect to this
interface. Moreover, the embedded language can easily be given
security-specific features such as a declassification operation or
run-time representation of privileges for access-control checks.

In both cases, we make use of Haskell’s strong type system to
guarantee that the abstractions enforcing the security policies are
not violated. This encapsulation means that it is not possible to use
the full power of the Haskell language to circumvent the
information-flow checks performed by the embedded language, for
example.

Before diving into the details of how the embedded language is
implemented, it is useful to see how we imagine these techniques
being used in practice. The next section gives some example programs
that illustrate how a secure embedded language and the Haskell
program can be smoothly integrated, explains some background about
Haskell syntax, and considers the issues with enforcing the desired
security properties. Section 3 describes the arrow interface and
explains how it corresponds to the example. Section 4 presents the
detailed implementation of an arrow-based sublanguage for
information-flow control. Section 5 discusses some limitations and
caveats with this approach and describes some future work. Section 6
concludes.

2. Example use of the embedded language 

This section presents the features of our secure embedded sublanguage
using code examples. We start by encoding a security lattice and use
simple program fragments to show how information-flow policies can be
specified in programs. Then, we use a larger application to introduce
declassification and policy enforcement mechanisms.

2.1. Encoding Security Lattices 

The security sublanguage provides a generic interface 
for defining security labels as first-class values. A security 
lattice can be implemented by the programmer using the 
Lattice typeclass: 

class (Eq a) => Lattice a where
  label_top    :: a
  label_bottom :: a
  label_join   :: a -> a -> a
  label_meet   :: a -> a -> a
  label_leq    :: a -> a -> Bool







The above definition says that values of type a can be used as
security labels if a supports the usual set of label operations.
Since we use standard Haskell values to represent security labels,
there is no limitation on the expressiveness of security policies:
the programmer has the freedom to choose any implementation for
security labels. For example, the following code implements a simple
security lattice of three points:

data TriLabel = LOW | MEDIUM | HIGH deriving (Eq)

instance Lattice TriLabel where
  label_top    = HIGH
  label_bottom = LOW
  label_join x y = if x `label_leq` y then y else x
  label_meet x y = if x `label_leq` y then x else y
  -- LOW is below every label; HIGH is below only itself.
  label_leq LOW    _    = True
  label_leq MEDIUM LOW  = False
  label_leq MEDIUM _    = True
  label_leq HIGH   HIGH = True
  label_leq HIGH   _    = False


We could have implemented a policy language as complex 
as the full decentralized label model [7], but for simplicity 
and ease of presentation, we will use this lattice definition 
throughout the paper. 
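
To illustrate the claim that any Haskell type can serve as a label
type, the following is a minimal sketch, not part of the paper's
library. It repeats the Lattice class so that it stands alone, and it
assumes two instances chosen purely for illustration: Bool as a
two-point lattice (False for public, True for secret) and a
componentwise product of two lattices.

```haskell
-- The Lattice interface, repeated here so the sketch is self-contained.
class (Eq a) => Lattice a where
  label_top    :: a
  label_bottom :: a
  label_join   :: a -> a -> a
  label_meet   :: a -> a -> a
  label_leq    :: a -> a -> Bool

-- Assumed two-point lattice: False is public, True is secret.
instance Lattice Bool where
  label_top    = True
  label_bottom = False
  label_join   = (||)
  label_meet   = (&&)
  label_leq x y = not x || y

-- Assumed product lattice: every operation is applied componentwise.
instance (Lattice a, Lattice b) => Lattice (a, b) where
  label_top    = (label_top, label_top)
  label_bottom = (label_bottom, label_bottom)
  label_join (a1, b1) (a2, b2) = (label_join a1 a2, label_join b1 b2)
  label_meet (a1, b1) (a2, b2) = (label_meet a1 a2, label_meet b1 b2)
  label_leq  (a1, b1) (a2, b2) = label_leq a1 a2 && label_leq b1 b2
```

In the product lattice, (True, False) and (False, True) are
incomparable, and their join is (True, True). A policy with several
independent secrecy dimensions could be expressed this way without
changing any code that is written against the Lattice interface.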

2.2. Programming with information-flow policies 

The security sublanguage defines a polymorphic abstract data type
Protected a to represent a computation of result type a. Internally,
such a protected computation is associated with its information-flow
policies, which represent the security levels of inputs and outputs.

The sublanguage provides a set of primitive operations for
constructing computations of protected types. The constructor of the
Protected type is held abstract to the user program, so protected
computations cannot be freely opened: they can only be accessed using
sublanguage primitives. The simplest operation is pure, which
converts a standard Haskell computation to a protected computation.
The sequencing operation >>> composes two protected computations
together by connecting the output of the first computation to the
input of the second.

By default, protected computations constructed by pure have no
information-flow constraints. Interesting policies can be specified
by using the tag operation. The following function tag_val takes an
arbitrary computation x and a security label l as inputs, converts x
to a protected closure using pure, and composes it with an
information-flow annotation using >>> and tag. The output is a
protected computation with an output label l.

tag_val :: a -> TriLabel -> Protected a
tag_val x l = pure (\_ -> x) >>> tag l

cH = tag_val 3 HIGH
cM = tag_val 4 MEDIUM
cL = tag_val 5 LOW


Using tag_val, we can define cH, cM, cL as protected values with
different information-flow policies. The following shows some
computation using these protected values:



The liftA2 function is a generic arrow operation that can be used in
our sublanguage to convert any standard binary operator to an
operator on protected types: t1 is the sum of cL and cM, t2 is the
product of cH and cM. The sublanguage syntax in the definition of t3
is more complex, but it suffices to say it represents the computation
if cH > 3 then cM else t1.

The security sublanguage rigorously captures both explicit and
implicit information flows in protected computations. When the above
code is executed, t1 will have label MEDIUM, while t2 and t3 will
have label HIGH. The control flow of the protected computation is
represented using operations provided by the sublanguage, and these
operations keep track of the information-flow policies and
constraints incrementally during the construction of protected
computations.

Any protected computation can be used together with the tag operation
to restrict the information flow. The following function takes a
protected computation c as argument and requires the output of c to
be no higher than MEDIUM:

expects_medium :: Protected a -> Protected a
expects_medium c = c >>> tag MEDIUM

Now, we can use this function with protected computations: 

success1 = expects_medium t1
failure2 = expects_medium t2
failure3 = expects_medium t3


The first computation success1 is fine, because t1 has label MEDIUM.
The second and the third both violate the information-flow policies,
because t2 and t3 both have label HIGH while MEDIUM is expected:
there is information flow from HIGH to MEDIUM.

So, what will happen if policies are violated? Crucially, Haskell
uses the lazy evaluation strategy by default. None of the above
computations are actually performed until their results are needed.
The security sublanguage is designed so that protected computations
cannot be evaluated unless their information-flow policies have been
checked by a run-time mechanism. Therefore, if the information-flow
policies are violated, none of the protected computations will be
performed, but a run-time error will be generated when the program
tries to use the protected computations. Although this is a dynamic
information-flow tracking system, it performs static analysis on the
control-flow graphs of protected computations and provides strong
security guarantees.
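
The behaviour described above can be imitated with a small toy model.
The sketch below is not the paper's library: the names P, seqFlow,
parFlow, tagVal, certify and expectsMedium are hypothetical, the model
is fixed to plain functions and the three-point lattice, and it uses
arr from today's Control.Arrow where the paper writes pure. It only
aims to show that labels and constraints are accumulated while a
computation is being built, and that the underlying function is run
only after every check has passed.

```haskell
import Prelude hiding (id, (.))
import Control.Category (Category (..))
import Control.Arrow (Arrow (..), (>>>))

-- Three-point lattice; the derived Ord gives LOW < MEDIUM < HIGH.
data TriLabel = LOW | MEDIUM | HIGH deriving (Eq, Ord, Show)

-- Flow type: Trans input output, or Flat (no labels of its own).
data Flow = Trans TriLabel TriLabel | Flat

-- Toy protected computation: a function, its flow type, and pending
-- constraints; a pair (l1, l2) records the requirement l1 <= l2.
data P b c = P (b -> c) Flow [(TriLabel, TriLabel)]

-- Sequencing: the output label of the first must flow to the input
-- label of the second; the check is recorded, not performed.
seqFlow :: Flow -> Flow -> (Flow, [(TriLabel, TriLabel)])
seqFlow (Trans i o) (Trans i' o') = (Trans i o', [(o, i')])
seqFlow Flat f = (f, [])
seqFlow f Flat = (f, [])

-- Pairing: meet of the input labels, join of the output labels.
parFlow :: Flow -> Flow -> Flow
parFlow (Trans i o) (Trans i' o') = Trans (min i i') (max o o')
parFlow Flat f = f
parFlow f Flat = f

instance Category P where
  id = P id Flat []
  P g fg cg . P f ff cf = P (g . f) fl (cf ++ cg ++ new)
    where (fl, new) = seqFlow ff fg

instance Arrow P where
  arr f = P f Flat []
  first (P f fl cs) = P (first f) fl cs
  P f ff cf *** P g fg cg = P (f *** g) (parFlow ff fg) (cf ++ cg)
  P f ff cf &&& P g fg cg = P (f &&& g) (parFlow ff fg) (cf ++ cg)

-- Identity computation whose input and output both carry label l.
tag :: TriLabel -> P a a
tag l = P id (Trans l l) []

tagVal :: a -> TriLabel -> P () a
tagVal x l = arr (const x) >>> tag l

-- Run the computation only if all recorded constraints hold and the
-- output label is visible to the observer (Flat is treated as LOW).
certify :: TriLabel -> P () c -> Either String c
certify observer (P f fl cs)
  | not (all (uncurry (<=)) cs) = Left "information-flow violation"
  | outLabel fl > observer      = Left "observer level too low"
  | otherwise                   = Right (f ())
  where
    outLabel (Trans _ o) = o
    outLabel Flat        = LOW

cL, cM, cH, t1, t2 :: P () Int
cL = tagVal 5 LOW
cM = tagVal 4 MEDIUM
cH = tagVal 3 HIGH
t1 = (cL &&& cM) >>> arr (uncurry (+))  -- output label MEDIUM
t2 = (cH &&& cM) >>> arr (uncurry (*))  -- output label HIGH

expectsMedium :: P () a -> P () a
expectsMedium c = c >>> tag MEDIUM
```

In this model, certify MEDIUM (expectsMedium t1) yields Right 9,
whereas certify HIGH (expectsMedium t2) yields Left with the violation
message: the constraint HIGH <= MEDIUM was recorded when the two
computations were composed and is rejected before the multiplication
is ever evaluated.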






2.3. Declassification and code privileges 

Our security sublanguage supports declassification mechanisms similar
to the one in Jif. Protected computations with high security levels
can be declassified to low levels using the declassify operation.
However, the code needs sufficient privilege to perform this
operation. The sublanguage provides an abstract type Priv to
represent the security level of the user at run time. Priv can be
passed to functions as a “capability” argument. The code privilege is
needed to verify information-flow policies and access protected
computations.

The sublanguage is designed so that any common value of type a can be
converted to a protected value of type Protected a. However,
protected computations have abstract types and cannot be used outside
the sublanguage. Whenever we want to access such a computation
outside the sublanguage, we are at an “end-point” of the secure
computation, and we need to certify that the computation satisfies
its information-flow policies. The cert operation takes a protected
computation and verifies its security policies: (1) the output label
must be LOW, and (2) the current code privilege must be sufficient to
perform the declassification operations in the protected computation.
The protected computation is suspended by lazy evaluation, and it
will not start executing unless the cert operation succeeds.

2.4. An interactive multi-user application 

We use an interactive application to demonstrate the use of these
language features. It simulates an online network service, where
users can log in and access information. There are only two kinds of
users: guests and administrators. Guests have security level LOW
while administrators have security level HIGH. Guests can enter
numbers as price bids, while the administrator can log in to see the
highest bid. The information-flow policy is that guests are not
allowed to know what the highest bid is. To implement this, we
maintain the highest bid as a global state stat with security level
HIGH. The following code shows a simple session for guest services.



The guest_service function takes two arguments: priv has type Priv
and it represents a code privilege passed to this function; stat has
type Protected Int and it is the secret global state. The function
returns a new state, which also has type Protected Int.

The main body of the guest service is written in the standard IO
monad; when it runs, it reads a number from input and updates the
state if the input i is larger than the protected state stat.

The body of the let expression, however, is the computation written
in our embedded sublanguage. The sublanguage implements a
standardized combinator interface called arrows (explained in the
next section), but the implementation details of these combinators
are hidden from the programmer. Instead, the programmer uses
Haskell’s “do-syntax” for arrow computations [8]. The do-syntax
provides higher-level language control constructs such as variable
binding and conditional branches, and the Haskell compiler translates
code in the do-syntax to the standard arrow operations, which are
overloaded by our sublanguage using Haskell typeclasses.

In the inner do block, there are two commands. The first command
“x <- stat -< ()” binds the value of the secret computation stat to a
local variable x, where x has type Int. Now, x can be freely used in
any computation, but it cannot escape the scope of the do-block. The
next command “if ... then ... else ...” performs a conditional
branching in the sublanguage. The body of the branch “returnA -< i”
generates the output for the do-block. Overall, the computation
represented by the “proc ... do” block is bound to the variable stat'
and it is returned as the result.


The admin_service function implements a similar session for
administrator services. It has the same type as guest_service.



In contrast to the previous function, admin_service uses the
combinators provided by the sublanguage directly. In the first let
expression, it declassifies the protected state to LOW level using
the declassify operation and binds the result to the variable low.
The variable low also has type Protected Int, but the security level
associated with it is LOW after the declassification. Then, it uses
the cert operation, together with its code privilege priv, to access
the computation protected in low. The result summary has an
unrestricted Int type and thus can be used as a common Haskell value.
The next line calls the standard Haskell printing function putStrLn
to send this value to the program output. Finally, it creates a new
protected value stat_new initialized to 0, sets the security policy
of this protected value to be HIGH, and returns it as the new global
state.






This function uses several combinator operations provided by the
security sublanguage. In the first let expression, it uses the
declassify operation to create an empty computation with the
information-flow policy HIGH -> LOW. Then, it uses the >>> operation
to sequentially compose two protected computations stat and
declassify together. Therefore, the result low will have the
information-flow policy LOW for its output. In the last let
expression, it uses the pure operation to convert a common Haskell
function to a protected computation. The tag operation creates an
empty computation with a fixed policy HIGH -> HIGH. When pure and tag
are composed together using >>>, it creates a new protected
computation with security level HIGH.



The service_loop function is part of the trusted computing base. It
authenticates users and dispatches services. On every loop, it reads
a user name and a password from the input, and authenticates the
user. The authenticate function constructs an abstract code privilege
depending on the identity of the user. Once a code privilege is
available, it is used to execute the corresponding service function.



puting base. It creates an authentication database and an initial
global state, which is tagged by the security level HIGH. The
combinators pure, tag, >>> are similar to those in admin_service.

Security guarantees: 

How does this code enforce our information-flow policies? 
Suppose an untrusted programmer adds the following code 
in guest_service to declassify the secret state: 



Or, suppose the services are incorrectly dispatched: 





In such cases, when the program tries to declassify the data and
certify the result using the guest privilege, there will be a
run-time error. The global state is tagged with the label HIGH, but
the guests can only acquire LOW privileges during the authentication
process. The declassification operation requires that the user
privilege be at least as high as the security level of the data to be
declassified. Therefore, the guest cannot declassify the global state
to LOW and use cert to steal the secret state. This provides a
security mechanism similar to that of using run-time principals [14].

Aside from the authentication process and initial setup of
confidential state, the information-flow policies are automatically
enforced throughout the system. The guest_service and the
admin_service are very simple in this example, but they can be scaled
to more complex services, system states and security policies.

Benefits: 

As we have seen, the application program is a fine-grained mixture of
normal components written in standard Haskell and secure components
constructed using a few special operations from the sublanguage. To
those unfamiliar with Haskell, it may be hard to distinguish where
the “embedded” language ends and the “base” language begins, but that
is part of the point: the programmer has easy access to both the
strong security guarantees of the embedded language and the full
power of Haskell at the same time. All the Haskell language features
and software libraries are still available.

3. The arrows interface 

As described in the introduction, arrows can be thought of as an
interface, called a type class in Haskell parlance, for defining
embedded sublanguages. Haskell actually provides several related type
classes that each provide refinements to the basic arrow
functionality.¹

The following code specifies the simplest Arrow type class. Here, a
is an abstract type of arrows with input type b and output type c.
This arrow type class supports only three operations: pure, (>>>),
and first.

class Arrow a where
  pure  :: (b -> c) -> a b c
  (>>>) :: a b c -> a c d -> a b d
  first :: a b c -> a (b, d) (c, d)


¹ The related references [5, 9, 8] in the bibliography can be used
for more detailed studies of arrows.











Basic blocks and compositions 

Intuitively, the pure operation lifts a Haskell function of type
b -> c into the arrow; such lifted functions serve as the “basic
blocks” of the control-flow graphs constructed via arrow combinators.
The infix operation >>> provides sequential (horizontal) composition
of computations, and first provides parallel (vertical) composition
of computations.

An instance of the Arrow type class is required to satisfy a set of
axioms that specify coherence properties between the operations. For
example, >>> is required to be associative. We omit the complete
description of the arrow axioms here; for our purposes, there is only
one interesting case to consider, and it is discussed in Section 5.

The simplest instance of Arrow is Haskell’s function arrow
constructor (->) itself: every function of type b -> c is also an
arrow computation (->) b c.
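
As a small illustration with the function instance, the sketch below
uses the Control.Arrow module shipped with current GHC, in which the
lifting operation that this paper calls pure is named arr. The names
increment, pipeline and onFirst are introduced here only for the
example.

```haskell
import Control.Arrow (arr, first, (>>>))

-- A lifted function is a "basic block"; for the arrow (->) lifting
-- is just the identity on functions.
increment :: Int -> Int
increment = arr (+ 1)

-- Sequential composition: increment first, then double.
pipeline :: Int -> Int
pipeline = increment >>> arr (* 2)

-- `first` runs an arrow on the first component of a pair and passes
-- the second component through unchanged.
onFirst :: (Int, String) -> (Int, String)
onFirst = first pipeline
```

Here pipeline 3 evaluates to 8, and onFirst (3, "x") evaluates to
(8, "x").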

Representing conditionals and loops 

The basic Arrow interface does not provide the ability to construct
conditional computations; it can construct only control-flow graphs
that represent straight-line code with no branches. Two other type
classes refine arrows by permitting conditional branches and loops.

The ArrowChoice type class provides an operation called left that
extends the Arrow interface with the ability to perform a one-sided
branch computation depending on the arrow’s input value. In Haskell,
the type Either b d describes a value that is a tagged union or
option type.

class Arrow a => ArrowChoice a where
  left :: a b c -> a (Either b d) (Either c d)


Using left and the other arrow primitives, the following 
operations can be implemented to construct different kinds 
of conditional computations: 

right :: a b c -> a (Either d b) (Either d c)
(+++) :: a b c -> a b' c' -> a (Either b b') (Either c c')
(|||) :: a b d -> a c d -> a (Either b c) d
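
For the function arrow, these operations behave as in the following
sketch, which uses the ArrowChoice methods exported by Control.Arrow
in current GHC. The function names are hypothetical and exist only for
this example.

```haskell
import Control.Arrow (left, right, (+++), (|||))

-- Apply a function on the Left branch only; Right values pass through.
bumpLeft :: Either Int String -> Either Int String
bumpLeft = left (+ 1)

-- Apply a function on the Right branch only.
shout :: Either Int String -> Either Int String
shout = right (++ "!")

-- Apply one function per branch, keeping the branch tags.
both :: Either Int String -> Either Int String
both = (+ 1) +++ (++ "!")

-- Merge the two branches into a single output type.
render :: Either Int String -> String
render = show ||| id
```

For instance, bumpLeft (Left 1) is Left 2, both (Right "a") is
Right "a!", and render (Left 7) is "7".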

ArrowLoop provides the embedded language with a loop construct
sufficient for encoding while and for loops. Intuitively, the loop
operator feeds the d output of the arrow back into the d input of the
arrow, introducing a cycle in the control-flow graph:

loop :: a (b, d) (c, d) -> a b c

The benefit of having this operation is that recursive 
computations can be constructed as a finite combination of 
arrow components. We will use this property in Sec. 4.5. 
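
A minimal sketch of the feedback behaviour, using the loop method of
the function-arrow instance in Control.Arrow. The example is an
assumption chosen for brevity rather than taken from the paper: the
fed-back value d is itself a function, so the recursion is tied
through the loop instead of through a recursive Haskell definition.

```haskell
import Control.Arrow (loop)

-- The second component of the output (a function) is fed back as the
-- second component of the input; laziness makes the knot well defined.
factorial :: Integer -> Integer
factorial =
  loop (\(n, fact) ->
          ( fact n
          , \k -> if k <= 0 then 1 else k * fact (k - 1)
          ))
```

The definition of factorial contains no recursive reference to itself;
the cycle exists only in the arrow structure introduced by loop.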

Translating the do-syntax 

Programming directly with the arrow operations is sometimes
cumbersome, because arrows require a point-free programming style.
The do-syntax for arrows [8] provides syntactic sugar for arrow
programming, such as arrow abstraction, arrow application, sequential
composition, conditional branching and recursion. Internally, the
Haskell compiler² translates the do-syntax used in the embedded
sublanguage into the basic arrow operations. For example, conditional
statements are translated to pure, >>> and ||| operators, using the
following rule:

[[ proc p -> if e then c1 else c2 ]] =
    pure (\p -> if e then Left p else Right p)
    >>> [[ proc p -> c1 ]] ||| [[ proc p -> c2 ]]

This rule translates the if command in the sublanguage. The if
construct in the translated code is the conditional expression in the
base Haskell language. The sublanguage syntax “proc p ->” provides an
arrow abstraction that binds the arrow input to the variable p.

The compiler is able to resolve this syntax overloading by using type
information: the type Protected mentioned in the example code informs
the compiler to use the definition of ||| given by our FlowArrow
instance of the ArrowChoice type class (described below).
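
The translation can be observed on ordinary functions. The sketch
below assumes GHC with the Arrows language extension and uses arr in
place of pure; clampSugar and clampPlain are hypothetical names. The
second definition is a hand translation that follows the rule above,
not the compiler's literal output.

```haskell
{-# LANGUAGE Arrows #-}
import Control.Arrow (arr, returnA, (>>>), (|||))

-- Written with the arrow do-syntax: the `if` is a sublanguage command.
clampSugar :: Int -> Int
clampSugar = proc x ->
  if x > 10
    then returnA -< 10
    else returnA -< x

-- Hand translation: a lifted function chooses a branch by tagging the
-- input with Left or Right, and ||| dispatches on that tag.
clampPlain :: Int -> Int
clampPlain =
  arr (\x -> if x > 10 then Left x else Right x)
    >>> (arr (\_ -> 10) ||| arr (\x -> x))
```

Both definitions compute the same function. The only base-language
conditional left in the translated version sits inside a lifted basic
block, while the branch itself is represented by the ||| combinator.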

3.1. Control flow in arrow sublanguages 

The Haskell programming language itself provides branching, looping,
and other control-flow constructs, so one might wonder why it is
necessary to re-implement all of these features in the embedded
sublanguage. Compared to Haskell’s full control-flow mechanisms
(which also include function calls and exceptions, for example), the
arrow type classes are actually quite impoverished. The arrow
interfaces isolate the base language (Haskell) and the sublanguage
(arrows): by design, the control-flow constructs in the base language
cannot be directly used to represent the control flow of the
sublanguage.

This separation property is crucial for the security analysis of
arrow-based sublanguages. If an arrow implements only the operations
in the ArrowChoice type class, conditional branches on arrow
computations can only be implemented using the given arrow operations
left, right, (+++), and (|||). By keeping the arrow implementation
abstract, the programmer is forced to use these arrow operations for
writing conditional branches, because there is no other way to
manipulate the interface.

Therefore, by designing the arrow interface with limited control-flow
primitives, the control-flow graph of an arrow computation is
determined by the composition of primitive arrow operations. In other
words, arrows can force computations to be constructed with static
and explicit control-flow structures. This makes it possible to
completely analyze the information flow in an arrow-based sublanguage
before running the computation. A more permissive interface to the
sublanguage (such as provided by monads for example) would allow base
language branches to leak information about supposedly protected
data.

² We use the Glasgow Haskell Compiler [1].

4. An embedded language for information-flow control

This section presents the design of our secure embedded sublanguage
using the arrows interface. The design uses the structure of arrow
transformers, which allows arrow sublanguages to be composed in a
modular, layered fashion.

4.1. Encoding flow types and constraints 

The design in this paper assumes arrow computations are purely
functional and uses a simple information-flow type system for arrow
computations. A computation has an input security label $l_1$ and an
output security label $l_2$. The typing judgment has the form

$$\Phi \vdash c : l_1 \rightarrow l_2$$

where c is a purely functional computation, $l_1 \rightarrow l_2$ is
the flow type assigned to c, and $\Phi$ is a list of label
constraints. The type system is presented in Figure 1.

The sublanguage flow type is encoded using the Haskell 
Flow data type: 

data Flow l = Trans l l | Flat


1. Trans l1 l2 specifies a security type $l_1 \rightarrow l_2$.

2. Flat means the input and output can be given the same arbitrary
label. It specifies a security type $l \rightarrow l$, where the
label $l$ can be determined by constraints in the context.

The label constraints are encoded using the Constraint 
data type: 

data Constraint l = LEQ l l | USERGEQ l


1. leq li I2 represents a direct ordering between two la¬ 
bels: l1 C l 2 . 


4.2. Encoding typing judgments and rules 


The abstract datatype FlowArrow defines our secure 
embedded language by implementing the arrow interfaces 
described above: 



A value of type FlowArrow 1 a b c is a record with 
three fields. The computation field encapsulates an arrow 
of type a b c that is the underlying computation. The flow 
field specifies the security levels for the input and output 
of the computation. The constraints field stores the list 
of flow constraints $ when the arrow computation is con¬ 
structed from smaller components. 

FlowArrow encodes an information-flow typing judg¬ 
ment for an effect-free arrow computation, using the encod¬ 
ing of flow types and constraints we just defined. The typ¬ 
ing judgment $ b c : l\ —> 12 is represented by the value: 


FAc (Trans l\ h)® 


FlowArrow uses a generic design of arrow transformers 
and it is parameterized by many types: 

1. The type l of security labels. (FlowArrow l) is an ar¬ 
row transformer. 

2. The effect-free arrow a we are transforming from. The 
simplest and most common case of a is the function ar¬ 
row (->). The result (FlowArrow 1 a) is also an ar- 


3. The input type b and output type c. 

The arrow a must have no side-effects. Although FlowArrow is
a generic arrow transformer, we do require that the arrow a
represent a purely functional computation where information
flows from one end to the other, so the information-flow
types in the form of $l_1 \to l_2$ will make sense. For the
rest of the paper, the reader can assume a is the function
arrow (->) for ease of understanding. The type (Protected a)
in Sec. 2 is actually an abbreviation for a FlowArrow type:


2. USERGEQ l represents the constraint
$l \sqsubseteq \mathit{user}$. It requires that the run-time
code privilege, which is represented as a label, be at least
l. This will be used in Sec. 4.4 when we implement
declassification.

The purpose of using constraints $\Phi$ is to implement late
binding of the security lattice. The type system collects the
label constraints when secure computations are constructed
from individual components. Such constraints are checked
when the secure computation is accessed through the policy
enforcement mechanism, namely, the cert operation. This
design makes it possible to use dynamic security lattices and
also helps when implementing declassification.


type Protected a = FlowArrow TriLabel (->) () a 

(FlowArrow l a) is an arrow, so we can use FlowArrow as a
sublanguage to represent computation. At the same time,
(FlowArrow l a) also encodes a typing judgment, so we can
verify the information-flow policies for the computation.
Essentially, the implementation of (FlowArrow l a) is a
type-checker: each arrow operation implements a typing rule
for that operation. For standard arrow operations on a,
FlowArrow lifts them by (1) running the original operation on
the computation fields of arguments, and (2) computing the
flow types and constraints using such information from
arguments.

The implementation of FlowArrow is given in Figure 2. In the
definition of each arrow operation, the operation in
FlowArrow (on the left hand side) is implemented using
operations in the arrow a (on the right hand side).

The pure operation returns a Flat flow type and no
constraints, because such computations have no
information-flow policies on them. It implements the PURE
typing rule. Flat represents a flow type $l \to l$, but the
label l never appears in the implementation; it can always be
inferred from context.

The (>>>) operation sequentially composes two arrow
computations. The flow types and constraints are computed
using the flow_seq function, which implements the SEQ typing
rule.

The first and left operations implement the ONE typing rule.
The (&&&), (***), (|||) and (+++) operations are parallel
compositions and they implement the PAR rule. The flow_par
function is used to compute parallel composition of flow
types.
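The lifting described above can be sketched as follows. This
is an illustration under stated assumptions rather than the
paper's Figure 2: the Flat cases of flow_seq follow the
surviving figure fragments, while the Trans/Trans case of
flow_seq and the result of flow_par are inferred from the
flow types quoted in Sec. 5.2. The Lattice class with methods
label_join and label_meet is assumed, and flow_par is given a
simplified result type. The operation the text calls pure is
named arr in the current Control.Arrow interface.

```haskell
import Prelude hiding (id, (.))
import Control.Category (Category (..))
import Control.Arrow (Arrow (..), (>>>))

data Flow l = Trans l l | Flat deriving (Eq, Show)
data Constraint l = LEQ l l | USERGEQ l deriving (Eq, Show)

data FlowArrow l a b c = FA
  { computation :: a b c, flow :: Flow l, constraints :: [Constraint l] }

-- Assumed lattice interface (method names are hypothetical).
class Eq l => Lattice l where
  label_join, label_meet :: l -> l -> l

data TriLabel = LOW | MEDIUM | HIGH deriving (Eq, Ord, Show)

instance Lattice TriLabel where
  label_join = max
  label_meet = min

-- SEQ: the output level of the first step must flow to the input
-- level of the second step.
flow_seq :: Flow l -> Flow l -> (Flow l, [Constraint l])
flow_seq Flat f2 = (f2, [])
flow_seq f1 Flat = (f1, [])
flow_seq (Trans l1 l2) (Trans l3 l4) = (Trans l1 l4, [LEQ l2 l3])

-- PAR: meet of the input levels, join of the output levels.
flow_par :: Lattice l => Flow l -> Flow l -> Flow l
flow_par Flat f2 = f2
flow_par f1 Flat = f1
flow_par (Trans l1 l2) (Trans l3 l4) =
  Trans (label_meet l1 l3) (label_join l2 l4)

-- Sequential composition runs the underlying arrows and combines
-- the flow types and constraints of both arguments.
instance Arrow a => Category (FlowArrow l a) where
  id = FA id Flat []
  FA c2 f2 t2 . FA c1 f1 t1 = FA (c2 . c1) f (t1 ++ t2 ++ t)
    where (f, t) = flow_seq f1 f2

instance (Lattice l, Arrow a) => Arrow (FlowArrow l a) where
  arr f = FA (arr f) Flat []                   -- PURE
  first (FA c f t) = FA (first c) f t          -- ONE
  FA c1 f1 t1 *** FA c2 f2 t2 =                -- PAR
    FA (c1 *** c2) (flow_par f1 f2) (t1 ++ t2)
  FA c1 f1 t1 &&& FA c2 f2 t2 =                -- PAR
    FA (c1 &&& c2) (flow_par f1 f2) (t1 ++ t2)
```

Each instance method touches only the three record fields, so
building a FlowArrow never runs the underlying computation;
it only assembles it alongside its typing judgment.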

The loop operation is slightly more interesting. It
implements the LOOP typing rule. Since loop connects the
output of a computation back to its input, we generate a
constraint to capture this information flow. This requires
that the input and the output have the same security level,
unless there is declassification inside the computation.

constraint_loop Flat = []
constraint_loop (Trans l1 l2) = [LEQ l2 l1]

flow_seq Flat f2 = (f2,[])
flow_seq f1 Flat = (f1,[])
flow_par :: (Lattice l) => Flow l -> Flow l ...
flow_par (Trans l1 l2) (Trans l3 l4) = ...

Figure 2: Implementation of arrow operations (fragments)
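As a rough sketch of the LOOP rule, the helper below lifts
loop from the underlying arrow and appends the feedback
constraint. The two constraint_loop equations are the ones
that survive from Figure 2; the function loopFA is a
hypothetical stand-in for the loop method of an ArrowLoop
instance, written as a plain function to keep the example
short.

```haskell
import Control.Arrow (ArrowLoop (..))

data Flow l = Trans l l | Flat deriving (Eq, Show)
data Constraint l = LEQ l l | USERGEQ l deriving (Eq, Show)

data FlowArrow l a b c = FA
  { computation :: a b c, flow :: Flow l, constraints :: [Constraint l] }

data TriLabel = LOW | MEDIUM | HIGH deriving (Eq, Ord, Show)

-- Feeding the output back to the input requires that the output
-- level may flow to the input level.
constraint_loop :: Flow l -> [Constraint l]
constraint_loop Flat = []
constraint_loop (Trans l1 l2) = [LEQ l2 l1]

-- Hypothetical helper: lift loop from the underlying arrow, keep the
-- flow type, and record the feedback constraint.
loopFA :: ArrowLoop a => FlowArrow l a (b, d) (c, d) -> FlowArrow l a b c
loopFA (FA c f t) = FA (loop c) f (t ++ constraint_loop f)
```

For a body with flow type Trans LOW HIGH, loopFA records
LEQ HIGH LOW, which a later check rejects in a lattice where
HIGH does not flow to LOW.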

As described in Section 3, the do-syntax of the secure
sublanguage is translated to these standard arrow operations.
Therefore, the typing judgment for code written in the
do-syntax can be derived by combining the typing rules
implemented in these standard arrow operations. For the
translation of conditional commands shown at the end of
Sec. 3, the combination of typing rules yields essentially
the same constraints as the following COND rule found in
conventional information-flow type systems:

$$\frac{\Gamma \vdash e_1 : l_1 \quad \Gamma \vdash e_2 : l_2 \quad \Gamma \vdash e_3 : l_3 \quad l_1 \sqsubseteq l_2 \sqcup l_3}{\Gamma \vdash \mathtt{if}\ e_1\ \mathtt{then}\ e_2\ \mathtt{else}\ e_3 : l_2 \sqcup l_3}\ \text{COND}$$

It is by this property that the implicit information flow
in the t3 example in Sec. 2.2 can be captured by our type
system.

4.3. Policy specification 

So far, the typing rules implemented in FlowArrow only
construct computations from smaller components while
composing their information-flow policies. Pure computations
are given the $l \to l$ flow type, but we need a way to
introduce more interesting flow types. Recall from Sec. 2
that the tag operation annotates a computation with a
security label. It implements the TAG rule in Figure 1.



When tag is applied to a label l, it creates an arrow that
represents an empty step of computation, with the flow type
$l \to l$. Intuitively, tag inserts a “pipe” in the middle of
the computation, with explicit flow types specified on both
ends. For example, to annotate the confidential value secret
with label HIGH, we can use the following code which has a
flow type HIGH $\to$ HIGH:


To assert that the output of a computation c has no
confidential information, we can simply use the following
code to connect c to a “low” pipe:
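The two uses of tag just described can be sketched as
follows. The definition of tag is reconstructed from the
prose (an identity step with flow type $l \to l$ and no
constraints) and its signature is an assumption; secret,
protectedSecret and assertLow are hypothetical names, and the
Trans/Trans case of flow_seq is inferred from Sec. 5.2.

```haskell
import Prelude hiding (id, (.))
import Control.Category (Category (..))
import Control.Arrow (Arrow, (>>>))

data Flow l = Trans l l | Flat deriving (Eq, Show)
data Constraint l = LEQ l l | USERGEQ l deriving (Eq, Show)

data FlowArrow l a b c = FA
  { computation :: a b c, flow :: Flow l, constraints :: [Constraint l] }

data TriLabel = LOW | MEDIUM | HIGH deriving (Eq, Ord, Show)

flow_seq :: Flow l -> Flow l -> (Flow l, [Constraint l])
flow_seq Flat f2 = (f2, [])
flow_seq f1 Flat = (f1, [])
flow_seq (Trans l1 l2) (Trans l3 l4) = (Trans l1 l4, [LEQ l2 l3])

instance Arrow a => Category (FlowArrow l a) where
  id = FA id Flat []
  FA c2 f2 t2 . FA c1 f1 t1 = FA (c2 . c1) f (t1 ++ t2 ++ t)
    where (f, t) = flow_seq f1 f2

-- TAG: an empty computation step with flow type l -> l.
tag :: Arrow a => l -> FlowArrow l a b b
tag l = FA id (Trans l l) []

-- Hypothetical confidential value.
secret :: Int
secret = 42

-- Appending tag HIGH gives the computation the flow type HIGH -> HIGH.
protectedSecret :: FlowArrow TriLabel (->) () Int
protectedSecret = FA (\() -> secret) Flat [] >>> tag HIGH

-- Connecting a computation c to a "low" pipe.
assertLow :: Arrow a => FlowArrow TriLabel a b c -> FlowArrow TriLabel a b c
assertLow c = c >>> tag LOW
```

Applying assertLow to protectedSecret does not fail
immediately; it records the constraint LEQ HIGH LOW, which is
rejected only when the constraints are checked.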


For confidentiality policies, we care about the future of
secret computation, i.e. where information will flow to.
Therefore, a good pattern for protecting confidentiality is
to append tag to the output of the computation we want to
protect. The design of tag and the arrow types are
completely symmetrical. If we have labels for integrity
policies, we can connect the output of tag to computations
that require trustworthy data as inputs. This is a
bi-directional design that works for both confidentiality
and integrity policies.

4.4. Declassification 

Declassification is a practical requirement for
language-based information-flow control. We need a mechanism
to allow information flow from high levels to low levels, but
only in controlled ways. The decentralized label model
(DLM) [7] solves this problem by assigning authority to code.
Each declassification statement can only weaken the security
policy that belongs to the authority of the code. In our
arrows framework, there is no difficulty in encoding the
labels in DLM, but we need a declassification mechanism that
takes code authority into account.

For simplicity, we designed the declassification construct
and its corresponding flow constraints for simple lattices
such as the TriLabel lattice implemented in Sec. 2. It is
simpler than the declassification mechanism in Jif, but it
implements the essence of code authority checking.



The declassify operation implements the DECL rule in
Figure 1. It is similar to the tag operation except that it
constructs a “pipe” where the security level of the output is
lower than the level of input. Similar to other operations,
declassify does not check the policies directly, but it
creates a constraint which can be checked later. When applied
to two label values, declassify l1 l2 creates an arrow with
flow type $l_1 \to l_2$ and a flow constraint USERGEQ l1
stating that the code privilege must be at least $l_1$. For
example, if the code privilege is HIGH, it can declassify
information from MEDIUM to LOW. But if the code privilege is
MEDIUM, it cannot declassify from HIGH to LOW.
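A small sketch of the DECL rule and of the two examples
above, assuming a declassify signature modelled on the
prose. The Lattice class with a single label_leq method and
the helper holds are hypothetical; holds only illustrates how
a collected constraint could later be compared against a code
privilege.

```haskell
import qualified Control.Category as C

data Flow l = Trans l l | Flat deriving (Eq, Show)
data Constraint l = LEQ l l | USERGEQ l deriving (Eq, Show)

data FlowArrow l a b c = FA
  { computation :: a b c, flow :: Flow l, constraints :: [Constraint l] }

-- Assumed lattice interface (method name is hypothetical).
class Eq l => Lattice l where
  label_leq :: l -> l -> Bool

data TriLabel = LOW | MEDIUM | HIGH deriving (Eq, Ord, Show)

instance Lattice TriLabel where
  label_leq = (<=)

-- DECL: a pipe from l1 down to l2 that demands privilege at least l1.
declassify :: C.Category a => l -> l -> FlowArrow l a b b
declassify l1 l2 = FA C.id (Trans l1 l2) [USERGEQ l1]

-- Hypothetical helper: does a constraint hold under a code privilege?
holds :: Lattice l => l -> Constraint l -> Bool
holds _    (LEQ l1 l2) = label_leq l1 l2
holds user (USERGEQ l) = label_leq l user

mediumToLow, highToLow :: FlowArrow TriLabel (->) Int Int
mediumToLow = declassify MEDIUM LOW
highToLow   = declassify HIGH LOW
```

Under this sketch, all (holds HIGH) (constraints mediumToLow)
is True, while all (holds MEDIUM) (constraints highToLow) is
False, matching the two cases in the text.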

4.5. Policy enforcement 

Finally, we need to check the flow types and constraints
that we have accumulated during the construction of a secure
computation. Since we have a declassification mechanism
which takes code privilege into account, the code privilege
must be provided to check the constraints.



The judgment $L, l_{user} \vdash c : l_{in} \to l_{out}$
states the security property to be checked: given a security
lattice L and the label $l_{user}$ representing the code
privilege, does the arrow computation c satisfy the
information-flow policy $l_{in} \to l_{out}$? The CERT rule
in Figure 1 checks this security property and the certify
function implements this rule.

The certify operation takes a few arguments:

1. The information-flow types $l_{in}$ and $l_{out}$ that we
expect the computation to have. If the flow type of the
secure computation is $l_1 \to l_2$, certify calls another
function to verify that $l_{in} \sqsubseteq l_1$ and
$l_2 \sqsubseteq l_{out}$.

2. The code privilege $l_{user}$, under which the computation
is performed. The certify operation checks all the
constraints that come with the secure computation. For any
constraint of the form USERGEQ l, a check is performed to
make sure that $l \sqsubseteq l_{user}$. If any constraint is
not satisfied, a run-time error is generated.

3. A FlowArrow value that includes the secure computation to
be checked. If all the above checks are successful, the
embedded secure computation is returned. Because we stacked
the arrow transformer FlowArrow on another arrow a, the
certify operation strips FlowArrow off and gives back
computations in arrow a.
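The three arguments and their checks can be sketched as
below. This is one possible reading of the CERT rule, not the
paper's code: the signature of certify, the helpers
check_levels and check_constraint, the treatment of Flat, the
Lattice method label_leq and the error messages are all
assumptions. Only label_bottom, Privilege, PR and the
definition of cert appear in the text.

```haskell
data Flow l = Trans l l | Flat deriving (Eq, Show)
data Constraint l = LEQ l l | USERGEQ l deriving (Eq, Show)

data FlowArrow l a b c = FA
  { computation :: a b c, flow :: Flow l, constraints :: [Constraint l] }

-- Assumed lattice interface.
class Eq l => Lattice l where
  label_bottom :: l
  label_leq    :: l -> l -> Bool

data TriLabel = LOW | MEDIUM | HIGH deriving (Eq, Ord, Show)

instance Lattice TriLabel where
  label_bottom = LOW
  label_leq    = (<=)

-- Code privilege; the constructor would be hidden from untrusted code.
data Privilege l = PR l

-- Argument 1: expected levels against the computed flow type.
-- Treating Flat as l -> l for some l between l_in and l_out is an
-- assumption of this sketch.
check_levels :: Lattice l => l -> l -> Flow l -> Bool
check_levels l_in l_out (Trans l1 l2) = label_leq l_in l1 && label_leq l2 l_out
check_levels l_in l_out Flat          = label_leq l_in l_out

-- Argument 2: each collected constraint against the code privilege.
check_constraint :: Lattice l => l -> Constraint l -> Bool
check_constraint _      (LEQ l1 l2) = label_leq l1 l2
check_constraint l_user (USERGEQ l) = label_leq l l_user

-- Argument 3: the FlowArrow; on success the underlying arrow is returned.
certify :: Lattice l => l -> l -> Privilege l -> FlowArrow l a b c -> a b c
certify l_in l_out (PR l_user) (FA c f t)
  | not (check_levels l_in l_out f)       = error "certify: flow type mismatch"
  | not (all (check_constraint l_user) t) = error "certify: constraint violated"
  | otherwise                             = c

cert :: Lattice l => Privilege l -> FlowArrow l a b c -> a b c
cert = certify label_bottom label_bottom
```

Both guards inspect only the flow and constraints fields, so
the decision to accept or reject is made before the
underlying computation c is ever applied to an input.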

Although certify is a dynamic enforcement mechanism (it
executes as part of the Haskell program), it provides strong
security guarantees. When an embedded computation is
certified, its whole control structure is examined using an
information-flow analysis before any part of the embedded
computation is performed. This process is like type checking
the embedded sublanguage.

Branches over arrow computations can only be constructed
using the operators provided in FlowArrow (see footnote 3).
The control structure of a secure computation is independent
of the values generated in the computation. Therefore, if the
run-time check fails, the failure does not leak information
about secrets inside the computation.

A minor caveat is that recursive arrow computations should
be constructed using the loop operation rather than using
standard Haskell recursion. The certify function checks the
flow types and constraints of the whole computation, so it
forces evaluation of all arrow operations used to construct
the computation. If the computation is recursively
constructed using standard Haskell recursion, certify will
essentially try to check an infinite control flow graph with
an infinite typing derivation, which will exhaust Haskell’s
stack space and eventually abort the program. In such cases
secret information is not leaked, but the secure computation
should be re-written using the loop operation.

The certify interface seems verbose, but it is very flexible
to use. For protecting confidentiality, we only care about
the security level of the output, so the argument $l_{in}$
can always be label_bottom. We also require all secrets be
declassified to the lowest security level before reaching
the output channel, so we let the argument $l_{out}$ always
be label_bottom. Thus, we hide the definition of certify and
define a simpler operation cert:

cert = certify label_bottom label_bottom 


3 Importantly, FlowArrow does not implement a richer
interface such as the ArrowApply type class that would make
it impossible to analyze the control structure.


4.6. Code privileges

When using certify, it is important that the code privilege
is correctly specified: untrusted code cannot call certify
using a code privilege that it does not have. Our solution is
to define an abstract data type Privilege that internally
stores a label as code privilege. The certify operation takes
values of the abstract type Privilege as its input.

data Privilege 1 = PR 1 


The key point is to make the constructor PR only available
in trusted modules. The program must be organized such that
untrusted code can only treat the Privilege type abstractly.
Privileges can only be created in trusted code and passed to
untrusted code as shown in Sec. 2, where the type Priv is an
abbreviation:

type Priv = Privilege TriLabel 

Developing appropriate design patterns for structuring
privileged code is an important task that we leave to future
work. An interesting question, as with all capability-based
authorization mechanisms, is how to revoke the privileges
passed to untrusted code. If the untrusted code has state and
runs under several privileges in different places, it can
steal privileges by storing and reusing them. One solution is
to encode version numbers in such privileges and have a
global state to indicate valid privileges; doing so would
require the top-level code to be inside a monad.

5. Discussion and future work 
5.1. Compile time vs. run time 

For applications written using the embedded language, there
are two stages of type-checking. At compile time, the base
language (in this case, Haskell) is type checked and
compiled. At run time, the embedded language is type checked
before embedded secure computations are executed. Therefore,
information-flow policy violations are not detected until
the application is launched. Although the sublanguage uses
static analysis techniques and provides similarly strong
security guarantees, this two-stage mechanism is sometimes
not as convenient as specialized languages such as Jif. Each
run of the application may only use part of the secure
computations, so debugging can be more difficult. Therefore,
it is appealing to have a one-stage, compile-time
enforcement mechanism for the sublanguage. Such a mechanism
is possible if it is built entirely in the static type
system of the host language.

Abadi et al. developed the dependency core calculus
(DCC) [2], which uses a hierarchy of monads to model
information flow. Tse and Zdancewic [15] translated DCC to
System F and showed that noninterference can be translated
to a more generic property called parametricity, which
states that polymorphic programs behave uniformly for all
their instantiations. An intuitive demonstration of this idea
is that abstract data types can be used as a protection
mechanism to hide high-security information. They presented
a Haskell implementation where each security level is
encoded using an abstract data type and binding operators
are defined to compose computations with permitted
information flows. This approach works well for simple
lattices, but encoding a security lattice of n points would
require $O(n^2)$ definitions for binding operators. This
makes it difficult to implement more complex security
lattices such as the decentralized label model.

The problem with this approach is policy expressiveness. The
type system of the base language must be expressive enough
to encode the syntax and the semantics of security policies.
Although Haskell has an expressive type system, it is not
clear how to encode more expressive policies directly in the
type language; we leave that as an open question to
investigate in the future.

5.2. Parallel composition and arrow axioms 

We used the arrow interface to build the embedded language,
but does FlowArrow satisfy all the arrow axioms? A quick
check of the arrow axioms [9] shows that the exchange axiom
does not seem to hold. Let f have the flow type
$l_1 \to l_2$ and g have the flow type $l_3 \to l_4$. There
are two canonical ways to compose f and g in parallel using
the first combinator. They should be equivalent:

first f >>> pure (id × g) = pure (id × g) >>> first f

However, our FlowArrow implementation yields the flow type
$l_1 \to l_4$ with constraints $\{l_2 \sqsubseteq l_3\}$ on
the left side and $l_3 \to l_2$ with
$\{l_4 \sqsubseteq l_1\}$ on the right. This seems to violate
the arrow axioms! Does our implementation make sense?

If we compose f and g naturally using the (***) operator, we
get $l_1 \sqcap l_3 \to l_2 \sqcup l_4$ with no constraints,
which is the least restrictive flow type. The types we get
from using first are both more restrictive than this one.
The problem with using first is that our analysis technique
is not fine-grained enough: it reasons about information
flow in a syntax-directed, end-to-end fashion that yields
imprecise flow types for first f >>> pure (id × g). This
coarse analysis does not compromise the security guarantee
because it is always conservative.
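The three flow types above can be reproduced with a small
symbolic calculation. The sketch assumes the SEQ and PAR
behaviour quoted in this section, follows the text in giving
the second composite the flow type of g, and uses a
hypothetical Label type whose Meet and Join constructors stand
for $\sqcap$ and $\sqcup$ so that results can be read off
directly.

```haskell
-- Symbolic labels: L n stands for l_n; Meet and Join are left
-- uninterpreted so the resulting flow types stay readable.
data Label = L Int | Meet Label Label | Join Label Label
  deriving (Eq, Show)

data Flow l = Trans l l | Flat deriving (Eq, Show)
data Constraint l = LEQ l l | USERGEQ l deriving (Eq, Show)

flow_seq :: Flow l -> Flow l -> (Flow l, [Constraint l])
flow_seq Flat f2 = (f2, [])
flow_seq f1 Flat = (f1, [])
flow_seq (Trans l1 l2) (Trans l3 l4) = (Trans l1 l4, [LEQ l2 l3])

flow_par :: Flow Label -> Flow Label -> Flow Label
flow_par Flat f2 = f2
flow_par f1 Flat = f1
flow_par (Trans l1 l2) (Trans l3 l4) = Trans (Meet l1 l3) (Join l2 l4)

-- f : l1 -> l2 and g : l3 -> l4, as in the text.
f, g :: Flow Label
f = Trans (L 1) (L 2)
g = Trans (L 3) (L 4)

-- Left and right sides of the exchange axiom, and the (***) composite.
lhs, rhs :: (Flow Label, [Constraint Label])
lhs = flow_seq f g   -- l1 -> l4 with {l2 <= l3}
rhs = flow_seq g f   -- l3 -> l2 with {l4 <= l1}

par :: Flow Label
par = flow_par f g   -- (l1 meet l3) -> (l2 join l4), no constraints
```

The two sequential results differ both in flow type and in
constraints, while par carries no constraints at all, which is
why it is the least restrictive of the three.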

To justify that our arrow implementation satisfies the arrow
axioms, we would need to give finer semantic interpretations
to the flow types and constraints. Intuitively, although both
sides of the exchange axiom are over-restrictive and have
different types, they can be considered equivalent in the
sense that they are both sound: using such flow types will
not lead to acceptance of insecure programs.

The practical ramification of this imprecision is that,
although soundness is not affected by using first,
programmers are encouraged to use the (&&&), (***), (|||) and
(+++) operations directly so that safe programs are more
likely to be accepted. The use of first, second, left and
right should be avoided whenever possible. Compared to
first, (&&&) is a more intuitive operation for parallel
composition because it resembles the product morphism in
category theory. The prior work on arrows [5] uses first as
the primitive operation because it is simpler and it gives
definite evaluation orders.

A consequence of this imprecision is that the security
analysis can be too restrictive for arrow computations
written in the do-syntax, because Haskell implements some
translation rules using first instead of (&&&). Fortunately,
conditional branches are translated using (|||), so
programmers can still write conditional branches in the
natural way. In general, we need a more precise type system
to avoid depending on particular implementations of the
translation rules of the do-syntax.

5.3. DLM and practical applications 

The declassification mechanism in this paper can be adapted
to work with the decentralized label model (DLM) [7], where
the constraints on code authority are expressed using the
act-for relation of principals. We are currently working on
the encoding and integration of DLM in the arrows framework.
Unlike Jif, where information-flow control is mostly static,
the arrows framework is a run-time mechanism, so the
principals, the act-for hierarchy and the security lattice
can all be dynamic. Such dynamic policies have long been
sought in language-based information-flow security because
they address practical requirements.

Once the dynamic DLM is implemented, it will be interesting
to see how it works in real applications. An important
benefit of our approach is that existing Haskell applications
can be enhanced with information-flow control without
complete rewriting. The programmers may proceed gradually by
changing the representation of secure components while
leaving most normal components untouched. It would be ideal
if the security-sensitive computation only takes a small
portion of the whole program, so information-flow policies
can be globally enforced by a few local modifications to the
program.

As mentioned in Sec. 4.6, there are still interesting open
questions on the protection and revocation of code
privileges. The dynamic checking makes debugging more
difficult because run-time errors are hard to observe,
reproduce and locate. All these problems need to be explored
in the contexts of concrete applications.

5.4. Implementing other type systems 

Although FlowArrow is a generic arrow transformer, the type
system implemented in FlowArrow only works with arrows that
have no side-effects, because we assign a simple
information-flow type $l_1 \to l_2$ to such arrow
computations. This brings up two questions. First, what
arrows can be used besides the function arrow (->)? The
stream processor [5] is an example: we can use FlowArrow to
track information flow for stream processors, which map
input streams to output streams. But in general, we need to
formally state the properties of arrows that can be used
with FlowArrow. Another question is: how can we modify the
type system in FlowArrow so that it could work for states and
other side-effects? For a specific effect such as memory
states, we can extend the type system implemented in
FlowArrow and lift the special arrow operations such as get
and put to FlowArrow while implementing appropriate typing
rules. By designing multiple versions of FlowArrow, we can
implement multiple security type systems, and they can be
used in one application at the same time.

The technical development in this paper is informal,
although we have implemented it in Haskell. The type system
implemented in FlowArrow can easily be justified following
standard techniques [16]. For more complex type systems,
however, justifying correctness often requires formal
reasoning. The security goal is usually formalized as
noninterference properties and the soundness is proved in a
language with well-defined formal semantics. To use such a
type system in the arrow framework, we must verify that the
semantics of the arrow sublanguage matches the semantics of
the toy languages used in the type soundness proofs.

6. Conclusion 

Using an embedded sublanguage of arrows, end-to-end
information-flow policies can be directly encoded and
enforced in Haskell using modular library extensions, with a
modest overhead of run-time checking. There is no need to
modify the Haskell language, and this embedded sublanguage
approach permits the information-flow technology to be
adopted gradually. The security mechanism is designed to be
generic with respect to computation types and security
lattices. There is great flexibility in the choice of
security policy frameworks; multiple policy frameworks can
coexist in the same program. Dynamic information-flow
policies can be expressed, yet the security guarantee is as
strong as that of static analysis.